The primal-dual scheme has been used to provide approximation algorithms for many problems.
Goemans and Williamson gave a $(2 - \frac{1}{n-1})$-approximation for the Prize-Collecting Steiner Tree
Problem that runs in $O(n^3 \log n)$ time. Johnson, Minkoff and Phillips proposed a faster
implementation of Goemans and Williamson's algorithm. We give a proof that the approximation
ratio of this implementation is exactly 2.

1 Introduction


Consider a graph $G = (V, E)$, a function $c$ from $E$ into the set $\mathbb{Q}_{\geq}$ of non-negative rationals and a
function $\pi$ from $V$ into $\mathbb{Q}_{\geq}$. The Prize-Collecting Steiner Tree Problem (PCST) asks for a
tree $T$ in $G$ such that $\sum_{e \in E_T} c_e + \sum_{v \in V \setminus V_T} \pi_v$ is minimum. (We denote by $V_T$ and $E_T$, respectively,
the vertex and edge sets of a graph $T$.) The rooted variant of the problem requires $T$ to contain a
given root vertex.




Goemans and Williamson [2, 3] used a primal-dual scheme to derive a $(2 - \frac{1}{n-1})$-approximation
for the rooted variant of PCST, where $n := |V|$. By trying all possible choices for the root, they
obtained a $(2 - \frac{1}{n-1})$-approximation for the unrooted PCST. The resulting algorithm runs in time
$O(n^3 \log n)$. Johnson, Minkoff and Phillips [1] proposed a modification of the algorithm that runs
the primal-dual scheme only once, resulting in a running time of $O(n^2 \log n)$. They claimed that their
algorithm, which we refer to as JMP, achieves an approximation ratio of $2 - \frac{1}{n-1}$. Unfortunately,
their claim does not hold.

This note does two things. First, it proves that the JMP algorithm is a 2-approximation 
(the proof involves some non-trivial technical details). Second, it shows an example where the 
approximation ratio achieved by the JMP algorithm is exactly 2, thereby contradicting the claim 
by Johnson, Minkoff and Phillips. 



This paper was originally published as http://www.ime.usp.br/~cris/publ/jmp-analysis.ps.gz in
2006. The present version makes explicit a stronger statement, implicit in the original version: that the addressed
implementation is a Lagrangean preserving 2-approximation. It also introduces some cosmetic changes in notation
and corrects a technical error in the proof of one of the invariants.

Departamento de Ciência da Computação, Instituto de Matemática e Estatística, Universidade de São Paulo, Rua
do Matão 1010, 05508-090 São Paulo/SP, Brazil. E-mail: {pf,cris,cef,coelho}@ime.usp.br. Research supported
in part by PRONEX/CNPq 664107/1997-4 (Brazil).

2 Notation and preliminaries

For any subset $F$ of $E$, let $c(F) := \sum_{e \in F} c_e$. For any subset $X$ of $V$, let $\pi(X) := \sum_{v \in X} \pi_v$ and let
$\overline{X} := V \setminus X$. If $T$ is a subgraph of $G$, we shall abuse notation and write $\pi(T)$ and $\pi(\overline{T})$ to mean
$\pi(V_T)$ and $\pi(\overline{V_T})$ respectively. Similarly, we shall write $c(T)$ to mean $c(E_T)$. Hence, the goal of
PCST$(G, c, \pi)$ is to find a tree $T$ in $G$ such that $c(T) + \pi(\overline{T})$ is minimum.
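
The objective can be made concrete with a small sketch. The function name `pcst_cost`, the dictionary-based representation of $c$ and $\pi$, and the toy instance are illustrative assumptions, not part of the original text.

```python
from fractions import Fraction

def pcst_cost(vertices, cost, prize, tree_vertices, tree_edges):
    """Return c(T) + pi(V minus V_T) for a tree T given by its vertex and edge sets."""
    edge_part = sum((cost[e] for e in tree_edges), Fraction(0))
    penalty_part = sum((prize[v] for v in vertices if v not in tree_vertices), Fraction(0))
    return edge_part + penalty_part

# Toy instance (hypothetical): the path a - b - c.
vertices = {"a", "b", "c"}
cost = {frozenset({"a", "b"}): Fraction(2), frozenset({"b", "c"}): Fraction(3)}
prize = {"a": Fraction(1), "b": Fraction(4), "c": Fraction(1)}

# Tree consisting of vertex b alone: no edge cost, prizes of a and c are forfeited.
only_b = pcst_cost(vertices, cost, prize, {"b"}, [])
# Tree spanning the whole path: both edges are paid, no prize is forfeited.
whole_path = pcst_cost(vertices, cost, prize, vertices, list(cost))
print(only_b, whole_path)
```

On this toy instance the single vertex b has value 2 and the whole path has value 5, so the single vertex is the better of the two candidates.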

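A collection $\mathcal{L}$ of nonempty subsets of $V$ is laminar if, for any two elements $L_1$ and $L_2$ of $\mathcal{L}$,
either $L_1 \cap L_2 = \emptyset$ or $L_1 \subseteq L_2$ or $L_1 \supseteq L_2$. For any subset $X$ of $V$, let

$$\mathcal{L}[X] := \{L \in \mathcal{L} : L \subseteq X\} \quad \text{and} \quad \mathcal{L}_X := \{L \in \mathcal{L} : L \supseteq X\}.$$

For every $L$ in $\mathcal{L}$ that is not in $\mathcal{L}[X] \cup \mathcal{L}[\overline{X}] \cup \mathcal{L}_X$, the sets $L \cap X$, $L \setminus X$ and $X \setminus L$ are all nonempty.
For any subgraph $T$ of $G$, we shall abuse notation and write $\mathcal{L}[T]$, $\mathcal{L}[\overline{T}]$, and $\mathcal{L}_T$ in place of $\mathcal{L}[V_T]$,
$\mathcal{L}[\overline{V_T}]$, and $\mathcal{L}_{V_T}$ respectively.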

The union of all sets in $\mathcal{L}$ shall be denoted by $\bigcup \mathcal{L}$. The set of all maximal elements of $\mathcal{L}$ shall
be denoted by $\mathcal{L}^*$. If $\mathcal{L}$ is laminar, the elements of $\mathcal{L}^*$ are pairwise disjoint. If, in addition, $\bigcup \mathcal{L} = V$
then $\mathcal{L}^*$ is a partition of $V$.
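
As an illustration of these definitions, the following sketch tests laminarity and extracts the maximal elements of a small collection. The helper names `is_laminar` and `maximal_elements` and the sample family are assumptions made for the example only.

```python
from itertools import combinations

def is_laminar(collection):
    """True if all sets are nonempty and any two are disjoint or nested."""
    sets = [frozenset(s) for s in collection]
    if any(not s for s in sets):
        return False
    return all(a.isdisjoint(b) or a <= b or b <= a for a, b in combinations(sets, 2))

def maximal_elements(collection):
    """Members of the collection that are not properly contained in another member."""
    sets = {frozenset(s) for s in collection}
    return {s for s in sets if not any(s < t for t in sets)}

family = [{1}, {2}, {3}, {1, 2}]          # hypothetical laminar family on V = {1, 2, 3}
laminar = is_laminar(family)
tops = maximal_elements(family)            # here: {1, 2} and {3}
crossing = is_laminar([{1, 2}, {2, 3}])    # the two sets overlap without nesting
```

In this sample the family is laminar, its union is the whole vertex set, and its maximal elements {1, 2} and {3} form a partition, as the text states.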

For any laminar collection $\mathcal{L}$ of subsets of $V$ and any edge $e$ of $G$, let $\mathcal{L}(e) := \{L \in \mathcal{L} : e \in \delta_G L\}$,
where $\delta_G L$ stands for the set of edges of $G$ with one end in $L$ and the other in $\overline{L}$.

Let $y$ be a function from $\mathcal{L}$ into $\mathbb{Q}_{\geq}$. For any subcollection $\mathcal{L}'$ of $\mathcal{L}$, let $y(\mathcal{L}') := \sum_{L \in \mathcal{L}'} y_L$. We
say that $y$ respects $c$ if

$$y(\mathcal{L}(e)) \le c_e \quad \text{for each } e \text{ in } E. \tag{1}$$

We say an edge $e$ is tight for $y$ if equality holds in (1). We say $y$ respects $\pi$ if

$$y(\mathcal{L}[X]) \le \pi(X) \quad \text{for each } X \text{ in } \mathcal{L}. \tag{2}$$

We shall say that $y$ saturates an element $X$ of $\mathcal{L}$ if equality holds in (2). The following lemma
summarizes the effect of the two "respects" constraints on $y$:

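Lemma 2.1 Let $\mathcal{L}$ be a laminar collection of subsets of $V$ and $y$ a function from $\mathcal{L}$ into $\mathbb{Q}_{\geq}$. If $y$
respects $c$ and $\pi$ then

$$y(\mathcal{L} \setminus \mathcal{L}_T) \le c(T) + \pi(\overline{T})$$

for any connected subgraph $T$ of $G$.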

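Proof. For $\mathcal{M} := \{L \in \mathcal{L} : \delta_T L \ne \emptyset\}$, we have $y(\mathcal{M}) \le \sum_{L \in \mathcal{M}} |\delta_T L|\, y_L = \sum_{e \in E_T} y(\mathcal{L}(e)) \le
\sum_{e \in E_T} c_e = c(T)$. For $\mathcal{N} := \mathcal{L}[\overline{T}]$, we have $y(\mathcal{N}) = \sum_{L \in \mathcal{N}^*} y(\mathcal{L}[L]) \le \sum_{L \in \mathcal{N}^*} \pi(L) \le \pi(\overline{T})$.
The lemma follows from the two inequalities since $\mathcal{L} = \mathcal{M} \cup \mathcal{N} \cup \mathcal{L}_T$. ■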
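
The two "respects" constraints and the bound of Lemma 2.1 can be checked mechanically on a small instance. The sketch below is only an illustration: the helper names (`cut_sets`, `respects_c`, `respects_pi`, `dual_outside`), the data layout, and the toy values of $y$ are assumptions chosen for the example.

```python
from fractions import Fraction

def cut_sets(collection, edge):
    """L(e): members of the collection with exactly one end of `edge` inside."""
    return [L for L in collection if len(edge & L) == 1]

def respects_c(collection, y, cost):
    """Constraint (1): the dual load on every edge is at most its cost."""
    return all(sum(y[L] for L in cut_sets(collection, e)) <= c for e, c in cost.items())

def respects_pi(collection, y, prize):
    """Constraint (2): the dual value packed inside X is at most pi(X)."""
    return all(
        sum(y[L] for L in collection if L <= X) <= sum(prize[v] for v in X)
        for X in collection
    )

def dual_outside(collection, y, tree_vertices):
    """Total dual value on the sets of the collection that do not contain V_T."""
    return sum(y[L] for L in collection if not tree_vertices <= L)

# Toy instance (hypothetical): path a - b - c with a laminar collection of four sets.
cost = {frozenset({"a", "b"}): Fraction(2), frozenset({"b", "c"}): Fraction(3)}
prize = {"a": Fraction(1), "b": Fraction(4), "c": Fraction(1)}
collection = [frozenset({"a"}), frozenset({"b"}), frozenset({"c"}), frozenset({"a", "b"})]
y = {L: Fraction(1) for L in collection}

ok_c = respects_c(collection, y, cost)
ok_pi = respects_pi(collection, y, prize)
# For T = the single vertex b, the right-hand side of Lemma 2.1 is 0 + pi(a) + pi(c) = 2.
lower_bound = dual_outside(collection, y, frozenset({"b"}))
```

Here both constraints hold, and for the tree consisting of vertex b alone the dual value outside $\mathcal{L}_T$ equals 2, which matches $c(T) + \pi(\overline{T}) = 2$, so the bound of the lemma is attained on this instance.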

Let $\mathrm{opt}(\mathrm{PCST}(G, c, \pi))$ denote the minimum value of the sum $c(T) + \pi(\overline{T})$ when $T$ is a tree
in $G$. Then the following corollary establishes the relevant lower bound for $\mathrm{opt}(\mathrm{PCST}(G, c, \pi))$:

Corollary 2.2 Let $\mathcal{L}$ be a laminar collection of subsets of $V$ and $y$ a function from $\mathcal{L}$ into $\mathbb{Q}_{\geq}$.
If $y$ respects $c$ and $\pi$ then $y(\mathcal{L} \setminus \mathcal{L}_O) \le \mathrm{opt}(\mathrm{PCST}(G, c, \pi))$ for any optimal solution $O$ of
PCST$(G, c, \pi)$. ■

Before we state the algorithm, a few more definitions are needed. Let $\mathcal{L}$ be a laminar collection
of subsets of $V$ such that $\bigcup \mathcal{L} = V$. We say that an edge is internal to $\mathcal{L}^*$ if both of its ends are in
the same element of $\mathcal{L}^*$. All other edges are external to $\mathcal{L}^*$. For any external edge, there are two
elements of $\mathcal{L}^*$ containing its ends. We call these two elements the extremes of the edge in $\mathcal{L}^*$.

Given a forest $F$ in $G$ and a subset $L$ of $V$, we say that $F$ is $L$-connected if $V_F \cap L = \emptyset$ or
the induced subgraph $F[V_F \cap L]$ is connected. In other words, $F$ is $L$-connected if the following
property holds: for any two vertices $x$ and $y$ of $F$ in $L$, there exists a path from $x$ to $y$ in $F$ and that
path never leaves $L$. If $F$ spans $G$ (as is the case during the first phase of the algorithm below),
the condition "$F[V_F \cap L]$ is connected" can, of course, be replaced by "$F[L]$ is connected".

For any collection $\mathcal{L}$ of subsets of $V$, we shall say that $F$ is $\mathcal{L}$-connected if $F$ is $L$-connected
for each $L$ in $\mathcal{L}$.

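For any collection $\mathcal{S}$ of subsets of $V$, we say a tree $T$ has no bridge in $\mathcal{S}$ if $|\delta_T S| \ne 1$ (whence
$\delta_T S = \emptyset$ or $|\delta_T S| \ge 2$) for all $S$ in $\mathcal{S}$. We say that a tree $T$ in $G$ is wrapped in $\mathcal{S}$ if $V_T \subseteq S$ for
some $S$ in $\mathcal{S}$.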
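
The three predicates just defined ($L$-connected, no bridge, wrapped) translate directly into code. The following is a minimal sketch under the assumption that edges are stored as two-element frozensets of a simple graph; the function names and the sample path are illustrative only.

```python
def is_set_connected(forest_vertices, forest_edges, L):
    """True if the forest is L-connected: V_F and L are disjoint, or F[V_F & L] is connected."""
    inside = set(forest_vertices) & set(L)
    if not inside:
        return True
    adjacency = {v: set() for v in inside}
    for e in forest_edges:
        if e <= inside:                      # keep only edges with both ends in L
            u, v = tuple(e)
            adjacency[u].add(v)
            adjacency[v].add(u)
    start = next(iter(inside))
    seen, stack = {start}, [start]
    while stack:                             # depth-first search inside the induced subgraph
        u = stack.pop()
        for w in adjacency[u] - seen:
            seen.add(w)
            stack.append(w)
    return seen == inside

def has_no_bridge(tree_edges, family):
    """True if no set S of the family has exactly one tree edge leaving it."""
    return all(sum(1 for e in tree_edges if len(e & S) == 1) != 1 for S in family)

def is_wrapped(tree_vertices, family):
    """True if some set of the family contains every vertex of the tree."""
    return any(set(tree_vertices) <= set(S) for S in family)

# Hypothetical tree: the path a - b - c - d.
path_vertices = {"a", "b", "c", "d"}
path_edges = [frozenset({"a", "b"}), frozenset({"b", "c"}), frozenset({"c", "d"})]

middle_ok = has_no_bridge(path_edges, [frozenset({"b", "c"})])   # two edges leave {b, c}
leaf_ok = has_no_bridge(path_edges, [frozenset({"d"})])          # one edge leaves {d}
wrapped = is_wrapped(path_vertices, [frozenset({"b", "c"})])
```

On this path, the set {b, c} is left by two tree edges and so is not a bridge, whereas the leaf set {d} is left by exactly one edge and is therefore a bridge.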

3 Johnson, Minkoff and Phillips' algorithm 

The JMP algorithm is a 2-approximation for the PCST. It receives $G$, $c$, $\pi$ and returns a tree $T$
in $G$ such that $c(T) + 2\pi(\overline{T}) \le 2\,\mathrm{opt}(\mathrm{PCST}(G, c, \pi))$. For our purposes, it would be enough to
have $c(T) + \pi(\overline{T})$ on the left side of the inequality. The factor 2 multiplying $\pi$ is a bonus, and,
because of it, the JMP algorithm is said to be a Lagrangean preserving 2-approximation [1].
The algorithm has two phases, the second one operating on the output of the first.

Phase I: Each iteration in phase I starts with a spanning forest $F$ in $G$, a laminar collection $\mathcal{L}$
of subsets of $V$ such that $\bigcup \mathcal{L} = V$, a subcollection $\mathcal{S}$ of $\mathcal{L}$, and a function $y$ from $\mathcal{L}$ into $\mathbb{Q}_{\geq}$ such
that the following invariants hold:

(i1) $F$ is $\mathcal{L}$-connected;

(i2) $y$ respects $c$ and $\pi$;

(i3) each edge of $F$ is tight for $y$;

(i4) $y$ saturates every element of $\mathcal{S}$;

(i5) no element of $\mathcal{L}^* \setminus \mathcal{S}$ is the union of elements of $\mathcal{S}$;

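(i6) for any $\mathcal{L}$-connected tree $T$ in $G$, if $T$ has no bridge in $\mathcal{S}$ and is not wrapped in $\mathcal{S}$ then

$$\sum_{e \in E_T} y(\mathcal{L}(e)) + 2\,y(\mathcal{L}[\overline{T}]) \le 2\,y(\mathcal{L} \setminus \mathcal{L}_{\{o\}}) \tag{3}$$

for any vertex $o$ of $G$.

The first iteration starts with $F = (V, \emptyset)$, $\mathcal{L} = \{\{v\} : v \in V\}$, $\mathcal{S} = \emptyset$, and $y = 0$. Each iteration
consists of the following:

Case I.1: $|\mathcal{L}^* \setminus \mathcal{S}| > 1$.

For $\varepsilon$ in $\mathbb{Q}_{\geq}$, let $y^\varepsilon$ be the function defined as follows: $y^\varepsilon_L = y_L + \varepsilon$ if $L \in \mathcal{L}^* \setminus \mathcal{S}$ and
$y^\varepsilon_L = y_L$ otherwise. Let $\varepsilon$ be the largest number in $\mathbb{Q}_{\geq}$ such that the function $y^\varepsilon$ respects
$c$ and $\pi$.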



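Subcase I.1.A: $y^\varepsilon$ saturates some element $L$ of $\mathcal{L}^* \setminus \mathcal{S}$.

Start a new iteration with $\mathcal{S} \cup \{L\}$ and $y^\varepsilon$ in the roles of $\mathcal{S}$ and $y$ respectively.
(The forest $F$ and the collection $\mathcal{L}$ do not change.)

Subcase I.1.B: some edge $e$ external to $\mathcal{L}^*$ is tight for $y^\varepsilon$ and has at least
one of its extremes in $\mathcal{L}^* \setminus \mathcal{S}$.

Let $L_1$ and $L_2$ be the extremes of $e$ in $\mathcal{L}^*$. Set $y^\varepsilon_{L_1 \cup L_2} := 0$ and start a
new iteration with $F + e$, $\mathcal{L} \cup \{L_1 \cup L_2\}$, and $y^\varepsilon$ in the roles of $F$, $\mathcal{L}$, and $y$
respectively. (The collection $\mathcal{S}$ does not change.)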

Case I.2: $|\mathcal{L}^* \setminus \mathcal{S}| = 1$.

This is the end of phase I. Start phase II. 

Phase II: During this phase, the collections $\mathcal{L}$ and $\mathcal{S}$ and the function $y$ remain unchanged. Let
$M$ be the only element of $\mathcal{L}^* \setminus \mathcal{S}$. Each iteration begins with a subgraph $T$ of $F$ such that

(i7) $T$ is an $\mathcal{L}$-connected tree;

(i8) $M \setminus V_T$ admits a partition into elements of $\mathcal{S}$.

The first iteration begins with $T = F[M]$. Each iteration does the following:

Case II.1: $|\delta_T Z| = 1$ for some $Z$ in $\mathcal{S}$.

Start a new iteration with $T - Z$ in place of $T$.

Case II.2: $|\delta_T Z| \ne 1$ for each $Z$ in $\mathcal{S}$.

Return $T$ and stop.
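
Phase II amounts to a simple pruning loop. The following sketch illustrates it under the assumption that the tree is given by vertex and edge sets and that the saturated sets are given as a list; the function name `prune` and the sample data are hypothetical.

```python
def prune(tree_vertices, tree_edges, saturated):
    """Repeatedly delete a saturated set Z that is joined to the rest of T by exactly one edge."""
    vertices, edges = set(tree_vertices), set(tree_edges)
    while True:
        bridge_set = next(
            (Z for Z in saturated if sum(1 for e in edges if len(e & Z) == 1) == 1),
            None,
        )
        if bridge_set is None:               # Case II.2: no saturated set is a bridge
            return vertices, edges
        vertices -= bridge_set               # Case II.1: replace T by T - Z
        edges = {e for e in edges if e <= vertices}

# Hypothetical input: T is the path a - b - c - d and the saturated sets are {d} and {c, d}.
path_edges = [frozenset({"a", "b"}), frozenset({"b", "c"}), frozenset({"c", "d"})]
kept_vertices, kept_edges = prune({"a", "b", "c", "d"}, path_edges,
                                  [frozenset({"d"}), frozenset({"c", "d"})])
```

Each pass removes at least one vertex of the tree, because the single edge leaving the chosen set has an end inside it, so the loop terminates. On the sample input the sets {d} and then {c, d} are pruned, leaving the edge between a and b.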

4 Analysis of the algorithm 

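Suppose, for the moment, that invariants (i1) to (i8) are correct. At the end of phase II, $T$ is a
tree by virtue of (i7). As $T$ is a subgraph of $F$, due to (i3),

$$c(T) = \sum_{e \in E_T} c_e = \sum_{e \in E_T} y(\mathcal{L}(e)).$$

On the other hand, $\mathcal{L}^* \cap \mathcal{S}$ is a partition of $\overline{M}$ and, by (i8), there is a partition of $M \setminus V_T$ into
elements of $\mathcal{S}$. Therefore, some subcollection $\mathcal{Z}$ of $\mathcal{S}$ is a partition of $\overline{V_T}$. Hence,

$$\pi(\overline{T}) = \sum_{S \in \mathcal{Z}} \pi(S) = \sum_{S \in \mathcal{Z}} y(\mathcal{L}[S]) \le y(\mathcal{L}[\overline{T}]).$$

Here, the second equality follows from (i4). Therefore,

$$c(T) + 2\pi(\overline{T}) \le \sum_{e \in E_T} y(\mathcal{L}(e)) + 2\,y(\mathcal{L}[\overline{T}]). \tag{4}$$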

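In order to show that (3) holds, we must verify that $T$ satisfies the hypotheses of (i6). By (i7), $T$
is $\mathcal{L}$-connected. Due to (i5), $M$ is not the union of elements of $\mathcal{S}$. Hence, by virtue of (i8), $T$ is not
wrapped in $\mathcal{S}$. Since we are in Case II.2, $T$ has no bridge in $\mathcal{S}$. Hence, $T$ satisfies the hypotheses
of (i6). Now, by (3) coupled with (4),

$$c(T) + 2\pi(\overline{T}) \le 2\,y(\mathcal{L} \setminus \mathcal{L}_{\{o\}}) \tag{5}$$

for any vertex $o$. Now, let $o$ be an arbitrary vertex of an optimal solution $O$ of PCST$(G, c, \pi)$.
Since $o$ is a vertex of $O$, every set containing $V_O$ contains $o$, that is, $\mathcal{L}_O \subseteq \mathcal{L}_{\{o\}}$.
Since $y$ respects $c$ and $\pi$, as stated in (i2), Corollary 2.2 implies

$$c(T) + 2\pi(\overline{T}) \le 2\,y(\mathcal{L} \setminus \mathcal{L}_{\{o\}}) \le 2\,y(\mathcal{L} \setminus \mathcal{L}_O) \le 2\,\mathrm{opt}(\mathrm{PCST}(G, c, \pi)).$$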

This proves the following theorem (which is the correct version of Theorem 3.2 by Johnson, Minkoff
and Phillips [1]):

Theorem 4.1 The JMP algorithm is a Lagrangean preserving 2-approximation for the PCST.

To complete the proof of the theorem we must only verify the invariants of the algorithm, 
something we shall do in the next section. 

The example in Figure 1 shows that the approximation ratio of the JMP algorithm can be
arbitrarily close to 2, regardless of the size of the graph. So, Theorem 4.1 is tight.

Figure 1: (a) An instance of the PCST. (b) The solution produced by the JMP algorithm when
p > 0. Its cost is 4. (c) The optimal solution, consisting of vertex u alone, has cost 2 + p. (d) A 
similar instance of arbitrary size consists of a long path. 



5 Proofs of the invariants 

Invariants (i1) to (i4) obviously hold at the beginning of each iteration of phase I. We must only
verify the other four invariants.

Proof of (i5). Obviously (i5) holds at the beginning of the first iteration. Now consider
an iteration where Case I.1 occurs. If Subcase I.1.A occurs, then (i5) remains trivially true at
the beginning of the next iteration. Next, suppose Subcase I.1.B occurs. Adjust notation so
that $L_1 \notin \mathcal{S}$. Since (i5) holds at the beginning of the current iteration, $L_1$ is not the union of
elements of $\mathcal{S}$. Hence, $L_1 \cup L_2$ is not the union of elements of $\mathcal{S}$. Therefore, (i5) remains
true at the beginning of the next iteration. ■

The verification of (i6) depends on the following lemma: 
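
Lemma 5.1 Let $\mathcal{P}$ be a partition of $V$ and $(\mathcal{A}, \mathcal{B})$ a bipartition of $\mathcal{P}$. Let $T$ be a tree in $G$. If $T$
is $\mathcal{P}$-connected, has no bridge in $\mathcal{B}$, and is not wrapped in $\mathcal{B}$, then

$$\frac{1}{2} \sum_{A \in \mathcal{A}} |\delta_T A| + |\mathcal{A}[\overline{T}]| \le |\mathcal{A}| - 1. \tag{6}$$

Proof. Let us say that two elements of $\mathcal{P}$ are adjacent if there is an edge of $T$ with these two
elements as extremes. This adjacency relation defines a graph $\mathcal{H}$ having $\mathcal{P}$ as set of vertices. Since
$T$ is $\mathcal{P}$-connected, the edges of $\mathcal{H}$ are in one-to-one correspondence with the edges of $T$ external
to $\mathcal{P}$. Hence, the degree of any vertex $P$ of $\mathcal{H}$ is exactly $|\delta_T P|$, and therefore $\frac{1}{2} \sum_{P \in \mathcal{P}} |\delta_T P| = |E_{\mathcal{H}}|$.
Since $T$ is connected, $\mathcal{H}$ has $1 + |\mathcal{P}[\overline{T}]|$ components (all are singletons, except at most one). Since
$T$ has no cycles and is $\mathcal{P}$-connected, $\mathcal{H}$ is a forest. Hence $|E_{\mathcal{H}}| = |\mathcal{P}| - 1 - |\mathcal{P}[\overline{T}]|$ and therefore

$$\frac{1}{2} \sum_{P \in \mathcal{P}} |\delta_T P| = |\mathcal{P}| - 1 - |\mathcal{P}[\overline{T}]|. \tag{7}$$

Now consider the vertices of $\mathcal{H}$ that are in $\mathcal{B}$. Since $T$ has no bridge in $\mathcal{B}$ and is not wrapped
in $\mathcal{B}$, each $B$ in $\mathcal{B}$ is such that either $|\delta_T B| \ge 2$ or $B \subseteq \overline{V_T}$. Hence $\sum_{B \in \mathcal{B}} |\delta_T B| \ge 2\,|\mathcal{B} \setminus \mathcal{B}[\overline{T}]|$, and
therefore

$$\frac{1}{2} \sum_{B \in \mathcal{B}} |\delta_T B| \ge |\mathcal{B}| - |\mathcal{B}[\overline{T}]|. \tag{8}$$

The difference between (7) and (8) is the claimed inequality (6). ■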
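
Inequality (6) can be evaluated numerically on a small example. The sketch below only computes the two sides of (6); it does not check the hypotheses of the lemma, which are verified by hand for the sample data in the comments. The function name `lemma_sides` and the instance are assumptions made for illustration.

```python
from fractions import Fraction

def lemma_sides(tree_vertices, tree_edges, A):
    """Return (left side, right side) of inequality (6) for the parts listed in A."""
    degree_sum = sum(sum(1 for e in tree_edges if len(e & part) == 1) for part in A)
    untouched = sum(1 for part in A if part.isdisjoint(tree_vertices))   # parts of A outside T
    return Fraction(degree_sum, 2) + untouched, len(A) - 1

# Hypothetical instance: V = {a, b, c, d, e}, T is the path a - b - c - d, vertex e is outside T.
tree_vertices = {"a", "b", "c", "d"}
tree_edges = [frozenset({"a", "b"}), frozenset({"b", "c"}), frozenset({"c", "d"})]
# Partition P = {{a}, {b, c}, {d}, {e}} with B = {{b, c}} and A = the remaining parts.
# T is P-connected, exactly two tree edges leave {b, c}, and T is not contained in {b, c}.
A = [frozenset({"a"}), frozenset({"d"}), frozenset({"e"})]

lhs, rhs = lemma_sides(tree_vertices, tree_edges, A)
```

Here the left side is $\frac{1}{2}(1 + 1 + 0) + 1 = 2$ and the right side is $3 - 1 = 2$, so (6) holds with equality on this instance.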

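Proof of (i6). It is clear that (i6) holds at the beginning of the first iteration. Now assume
that it holds at the beginning of some iteration where Case I.1 occurs.

Suppose, first, that Subcase I.1.A occurs. At the end of the subcase, let $\mathcal{S}' := \mathcal{S} \cup \{L\}$, let $o$
be any vertex, and let $T$ be an $\mathcal{L}$-connected tree that has no bridge in $\mathcal{S}'$, is not wrapped in $\mathcal{S}'$,
and such that all its edges are tight for $y^\varepsilon$. Of course all edges of $T$ are tight for $y$. Since $T$ has
no bridge in $\mathcal{S}$ and is not wrapped in $\mathcal{S}$, (3) holds. We must show that (3) also holds when $y^\varepsilon$ is
substituted for $y$. Let $\mathcal{P} := \mathcal{L}^*$, $\mathcal{A} := \mathcal{L}^* \setminus \mathcal{S}$, and $\mathcal{B} := \mathcal{L}^* \cap \mathcal{S}$. Since $|\mathcal{A}_{\{o\}}| \le 1$, Lemma 5.1 implies

$$\sum_{A \in \mathcal{A}} |\delta_T A|\, \varepsilon + 2\,|\mathcal{A}[\overline{T}]|\, \varepsilon \le 2\,|\mathcal{A} \setminus \mathcal{A}_{\{o\}}|\, \varepsilon.$$

The addition of this inequality to (3) produces

$$\sum_{e \in E_T} y^\varepsilon(\mathcal{L}(e)) + 2\,y^\varepsilon(\mathcal{L}[\overline{T}]) \le 2\,y^\varepsilon(\mathcal{L} \setminus \mathcal{L}_{\{o\}}),$$

since $y^\varepsilon$ differs from $y$ only in $\mathcal{A}$. Hence, (i6) remains true at the beginning of the next iteration.

Now suppose Subcase I.1.B occurs. At the end of the subcase, let $\mathcal{L}' := \mathcal{L} \cup \{L_1 \cup L_2\}$, let
$o$ be any vertex, and let $T$ be an $\mathcal{L}'$-connected tree that has no bridge in $\mathcal{S}$ and is not wrapped
in $\mathcal{S}$. Since $T$ is $\mathcal{L}$-connected, (3) holds. We must show that (3) remains true when $y^\varepsilon$ and $\mathcal{L}'$ are
substituted for $y$ and $\mathcal{L}$ respectively. Let $\mathcal{P} := \mathcal{L}^*$, $\mathcal{A} := \mathcal{L}^* \setminus \mathcal{S}$, and $\mathcal{B} := \mathcal{L}^* \cap \mathcal{S}$. Since $|\mathcal{A}_{\{o\}}| \le 1$,
Lemma 5.1 implies $\sum_{A \in \mathcal{A}} |\delta_T A|\, \varepsilon + 2\,|\mathcal{A}[\overline{T}]|\, \varepsilon \le 2\,|\mathcal{A} \setminus \mathcal{A}_{\{o\}}|\, \varepsilon$, as in the previous case. The addition
of this inequality to (3) produces

$$\sum_{e \in E_T} y^\varepsilon(\mathcal{L}'(e)) + 2\,y^\varepsilon(\mathcal{L}'[\overline{T}]) \le 2\,y^\varepsilon(\mathcal{L}' \setminus \mathcal{L}'_{\{o\}}),$$

since $y^\varepsilon_{L_1 \cup L_2} = 0$ and $y^\varepsilon$ differs from $y$ only in $\mathcal{A}$. Hence, (i6) remains true at the beginning of the
next iteration. ■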

Proof of (i7). Suppose we are at the beginning of the first iteration of phase II. Let $L$ be
an element of $\mathcal{L}$ such that $L \cap V_T \ne \emptyset$. Since $V_T = M \in \mathcal{L}^*$, we have $L \subseteq V_T$ and therefore
$T[V_T \cap L] = T[L] = F[L]$. Since $F[L]$ is connected by virtue of (i1), so is $T[V_T \cap L]$. This argument
shows that $T$ is $\mathcal{L}$-connected. In particular, $T$ is $M$-connected and therefore $T$ is a tree. Hence,
(i7) holds at the beginning of the first iteration.

Now suppose (i7) holds at the beginning of some iteration where Case II.1 occurs. Let $L$ be an
element of $\mathcal{L}$ and let $u$ and $v$ be vertices in $L \cap (V_T \setminus Z)$. Let $P$ be the unique path from $u$ to $v$ in $T$.
We may assume that $P$ never leaves $L$. Moreover, $P$ never enters $Z$, given that $|\delta_T Z| = 1$. Hence,
$T - Z$ is $L$-connected. For the same reason, $T - Z$ is a tree. Hence (i7) holds at the beginning of
the next iteration. ■

Proof of (i8). At the beginning of the first iteration of phase II, (i8) holds because $V_T = M$.
Now consider an iteration where Case II.1 occurs. We may assume that there is a partition $\mathcal{U}$ of
$M \setminus V_T$ into elements of $\mathcal{S}$. If $Z \subseteq V_T$ then $\mathcal{U} \cup \{Z\}$ is a partition of $M \setminus (V_T \setminus Z)$ into elements
of $\mathcal{S}$. Otherwise, $Z$ includes some of the elements of $\mathcal{U}$ and is disjoint from all the others. Hence,
$\{Z\} \cup \{U \in \mathcal{U} : U \cap Z = \emptyset\}$ is a partition of $M \setminus (V_T \setminus Z)$ into elements of $\mathcal{S}$. This shows that (i8)
holds at the beginning of the next iteration. ■